Detecting Erroneous Uses of Complex Postpositions in an Agglutinative Language

Authors

  • Arantza Díaz de Ilarraza
  • Koldo Gojenola
  • Maite Oronoz
Abstract

This work presents the development of a system that detects incorrect uses of complex postpositions in Basque, an agglutinative language. Error detection in complex postpositions is interesting because: 1) the context of detection is limited to a few words; 2) it involves the interaction of multiple levels of linguistic processing (morphology, syntax and semantics). The system must therefore deal with problems ranging from tokenization and ambiguity to syntactic agreement and the examination of local contexts. The evaluation tested both the detection of incorrect uses of postpositions and the occurrence of false alarms.

© 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/). Some rights reserved.

1 Structure of complex postpositions

Basque postpositions play a role similar to English prepositions, with the difference that they appear at the end of noun phrases or postpositional phrases. They are defined as “forms that represent grammatical relations among phrases appearing in a sentence” (Euskaltzaindia, 1994). There are two main types of postpositions in Basque: (1) a suffix appended to a lemma, and (2) a suffix followed by a lemma (the main element) that can also be inflected.

(1) etxe-tik
    house-(from the)
    from the house

(2) etxe-aren gain-etik
    house-(of the) top-(from the)
    from the top of the house

The second type has been termed complex postposition. We will use this term to name the whole sequence of two words involved, and not just to refer to the second element. Complex postpositions can be described as:

(3) lemma1 + (suffix1 + lemma2 + suffix2)

In these constructions, the second lemma is fixed for each postposition, while the first lemma allows for much more variation, ranging from every noun to some specific semantic classes.
The above description (3) is intended to stress (with parentheses) the fact that the combination of both suffixes with the second lemma acts as a complex case-suffix that is “appended” to the first lemma. Both suffixes present different combinations of number and case, which can agree in several ways, depending on the lemma, case or contextual factors. Table 1 shows the different variants of two complex postpositions, derived from the lemmas bitarte and aurre. For example, the lemma bitarte is polysemous (“means, by means of, instrument, while (temporal), between”). Multiple factors affect the correctness of a postposition, including morphological and syntactic constraints. We also discovered a number of relevant contextual factors, which are not explicitly accounted for in standard grammars.
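The template in (3) can be made concrete with a small Python sketch. It is an illustration only, not the system described in the paper. The names `ComplexPostposition`, `ALLOWED_SUFFIX_PAIRS`, `parse_pair` and `is_known_combination` are hypothetical. The lookup table contains just the one suffix pair attested in example (2), and the input is assumed to be already segmented with hyphens, which real text would require morphological analysis to produce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComplexPostposition:
    """One instance of the template lemma1 + (suffix1 + lemma2 + suffix2)."""
    lemma1: str
    suffix1: str
    lemma2: str
    suffix2: str


# Hypothetical toy table: (suffix1, suffix2) pairs listed per fixed second lemma.
# Only the pair from example (2) is included; it is NOT the paper's Table 1.
ALLOWED_SUFFIX_PAIRS = {
    "gain": {("aren", "etik")},
}


def parse_pair(word1: str, word2: str) -> Optional[ComplexPostposition]:
    """Split two hyphen-segmented words, e.g. 'etxe-aren' and 'gain-etik'.

    Returns None when either word lacks a lemma-suffix boundary.
    """
    parts1 = word1.split("-", 1)
    parts2 = word2.split("-", 1)
    if len(parts1) != 2 or len(parts2) != 2:
        return None
    return ComplexPostposition(parts1[0], parts1[1], parts2[0], parts2[1])


def is_known_combination(cp: ComplexPostposition) -> bool:
    """True if the suffix pair is listed in the toy table for the second lemma."""
    allowed = ALLOWED_SUFFIX_PAIRS.get(cp.lemma2, set())
    return (cp.suffix1, cp.suffix2) in allowed


if __name__ == "__main__":
    candidate = parse_pair("etxe-aren", "gain-etik")
    print(candidate)
    print(is_known_combination(candidate))
```

A pair missing from the toy table is only "not listed", not necessarily ungrammatical. As the text notes, correctness also depends on the lemma, case and contextual factors, which a flat lookup like this does not model.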


Similar resources

The Production of Nominal and Verbal Inflection in an Agglutinative Language: Evidence from Hungarian

The contrast between regular and irregular inflectional morphology has been useful in investigating the functional and neural architecture of language. However, most studies have examined the regular/irregular distinction in non-agglutinative Indo-European languages (primarily English) with relatively simple morphology. Additionally, the majority of research has focused on verbal rather than no...


Transferring Syntactic Relations of Subject-Verb-Object Pattern in Chinese-to-Korean SMT

Since most Korean postpositions signal grammatical functions such as syntactic relations, generation of incorrect Korean postpositions results in producing ungrammatical outputs in machine translations targeting Korean. Chinese and Korean belong to morphosyntactically divergent language pairs, and usually Korean postpositions do not have their counterparts in Chinese. In this paper, we propose ...


Runtime Checking of Datatype Signatures in MPI

The MPI standard provides a way to send and receive complex combinations of datatypes (e.g., integers and doubles) with a single communication operation. The MPI standard specifies that the type signature, that is, the basic datatypes (language-defined types such as int or DOUBLE PRECISION), must match in communication operations such as send/receive or broadcast. Because datatypes may be defin...


Crosslinguistic Computation and a Rhythm-based Classification

The classification of languages is an old issue but is most commonly guided by a genetic and/or areal perspective, or, when guided by the perspective of structural relationships, attempts for divisions on separate levels of linguistic description such as morphology or syntax. Our study, however, contributes to the alternative but “hopeful” program (Plank 1998) of a holistic typology relating ph...




Publication date: 2008